
The Inverse Drum Machine: Source Separation Through Joint Transcription and Analysis-by-Synthesis

Torres, Bernardo, Peeters, Geoffroy, Richard, Gael

arXiv.org Machine Learning

We present the Inverse Drum Machine (IDM), a novel approach to Drum Source Separation that leverages an analysis-by-synthesis framework combined with deep learning. Unlike recent supervised methods that require isolated stem recordings, our approach operates on drum mixtures with only transcription annotations. IDM integrates Automatic Drum Transcription and One-shot Drum Sample Synthesis, jointly optimizing these tasks in an end-to-end manner. By convolving synthesized one-shot samples with estimated onsets, akin to a drum machine, we reconstruct the individual drum stems and train a Deep Neural Network on the reconstruction of the mixture. Experiments on the StemGMD dataset demonstrate that IDM achieves separation quality comparable to state-of-the-art supervised methods that require isolated stem data, while significantly outperforming matrix decomposition baselines.

In Western popular music, the rhythmic foundation typically relies on percussion instruments from a standard drum kit comprising kick drum, snare drum, and hi-hat, while additional elements such as cymbals, tom-toms, and auxiliary percussion provide timbral complexity and rhythmic variation. Music producers and engineers often need to adjust individual drum instruments separately for remixing, rebalancing, effects processing, or creating educational materials [1], [2]. Ideally, music production would use isolated recordings of each drum instrument (known as "stems"), allowing precise control during mixing. However, these instruments are usually played simultaneously and by the same performer, resulting in recordings in which all elements are mixed into a single audio stream. Obtaining separated stems during recording requires multiple microphones (leading to microphone bleed) or asking musicians to play in unnatural conditions [3]. The need for tools that can extract individual drum stems from already mixed recordings has led to growing interest in Drum Source Separation (DSS).
These solutions, however, are proprietary and still have limitations in separation quality and flexibility. DSS is challenging due to the acoustic properties of percussion sounds.
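The drum-machine analogy at the heart of IDM can be sketched in a few lines: each stem is the convolution of a per-instrument onset impulse train with a one-shot sample, and the stems sum to the mixture the network is trained to reconstruct. The sketch below is an illustration of that idea with NumPy, not the authors' implementation; the sample rate, toy one-shots, and onset lists are all invented for the example.

```python
import numpy as np

SR = 16000  # assumed sample rate for this toy example


def synthesize_stem(onsets, one_shot, length):
    """Convolve an impulse train of (time, velocity) onsets with a
    one-shot sample, the way a drum machine triggers it."""
    activation = np.zeros(length)
    for t, vel in onsets:          # onset time in samples, velocity in [0, 1]
        activation[t] = vel
    # convolution places a scaled copy of the one-shot at every onset
    return np.convolve(activation, one_shot)[:length]


# toy one-shots: decaying noise bursts standing in for kick and snare
rng = np.random.default_rng(0)
kick = rng.standard_normal(2000) * np.exp(-np.linspace(0, 8, 2000))
snare = rng.standard_normal(1500) * np.exp(-np.linspace(0, 10, 1500))

length = SR  # one second of audio
stems = {
    "kick": synthesize_stem([(0, 1.0), (8000, 0.8)], kick, length),
    "snare": synthesize_stem([(4000, 0.9), (12000, 0.7)], snare, length),
}
# the training signal: the resynthesized stems should add up to the input mix
mixture = sum(stems.values())
```

In IDM both the onsets and the one-shots are estimated by the network; here they are fixed, which is exactly what makes the reconstruction loss on the mixture a usable training signal without isolated stems.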


Toward Deep Drum Source Separation

Mezza, Alessandro Ilic, Giampiccolo, Riccardo, Bernardini, Alberto, Sarti, Augusto

arXiv.org Artificial Intelligence

In the past, the field of drum source separation faced significant challenges due to limited data availability, hindering the adoption of cutting-edge deep learning methods that have found success in other related audio applications. In this manuscript, we introduce StemGMD, a large-scale audio dataset of isolated single-instrument drum stems. Each audio clip is synthesized from MIDI recordings of expressive drum performances using ten real-sounding acoustic drum kits. Totaling 1224 hours, StemGMD is the largest audio dataset of drums to date and the first to comprise isolated audio clips for every instrument in a canonical nine-piece drum kit. We leverage StemGMD to develop LarsNet, a novel deep drum source separation model. Through a bank of dedicated U-Nets, LarsNet can separate five stems from a stereo drum mixture faster than real-time and is shown to significantly outperform state-of-the-art nonnegative spectro-temporal factorization methods.
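The "bank of dedicated U-Nets" design amounts to running one spectrogram-masking model per stem over the same input mixture. The following is a minimal sketch of that pipeline using SciPy's STFT, with trivial uniform masks standing in for the U-Nets; the stem names follow the five stems mentioned above, but the model interface and parameters are assumptions for illustration.

```python
import numpy as np
from scipy.signal import stft, istft

STEMS = ["kick", "snare", "toms", "hihat", "cymbals"]  # the five separated stems


def separate(mix, models, sr=44100, nperseg=2048):
    """Apply one masking model per stem to the mixture spectrogram."""
    _, _, X = stft(mix, fs=sr, nperseg=nperseg)
    out = {}
    for name in STEMS:
        mask = models[name](np.abs(X))  # stand-in for a dedicated U-Net
        _, y = istft(mask * X, fs=sr, nperseg=nperseg)
        out[name] = y[: len(mix)]       # trim the STFT padding
    return out


# toy "models": uniform masks that split the energy evenly across stems
models = {name: (lambda mag: np.full_like(mag, 1 / len(STEMS))) for name in STEMS}
mix = np.random.default_rng(1).standard_normal(44100)
stems = separate(mix, models)
```

Because masking and the inverse STFT are linear, masks that sum to one per time-frequency bin guarantee the stems add back up to the mixture; a trained bank of U-Nets does not enforce this by construction, which is one reason mask design matters in such systems.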


Serato Studio helps simplify the path to music production

Engadget

It's a safe assumption that most DJs have the itch to create some music of their own. Obviously many of them do, but for some, the expense of new gear or the learning curve involved with the software hinders that quest. To help bridge the gap, makers of the popular Serato DJ software are releasing a new product: Serato Studio (macOS/Windows). For those who already use the company's DJ hardware and software, things will be pleasantly familiar, helping ease you into the song-making process. Of course, you don't need to be a DJ; this tool is great for anyone who wants to make music with fewer "technical roadblocks" and more creative flow.


Oomm-tsss, oomm-tsss, Oomm-tsss, oomm-tsss... it's an AI beatbox

#artificialintelligence

Nao Tokui – a visiting associate professor at Kyushu University in California and CEO of Qosmo, an AI and music startup – has developed a neural-network-based system that collects about 20 seconds of any sound to produce a custom drum kit, and then automatically sequences rhythms using those utterances and noises. Any snippet of audio can be used as input, from your own voice to improvised percussion. In a video demo of the JavaScript-based code, Tokui gently slaps his cheek and flicks a plastic bottle. The sounds are recorded by his computer's microphone and fed into the software to generate a rhythm from the audio: whatever's recorded by the code is automagically split and assigned to the instruments that make up the virtual drum kit, such as the kick drum, snare, hi-hat, and tom-toms. After all this, the model strings together combinations of the kit's components into a sequence to produce a loop that you can bop your head to.